 clinical outcome



End to End AI System for Surgical Gesture Sequence Recognition and Clinical Outcome Prediction

Li, Xi, Matsumoto, Nicholas, Pasupulety, Ujjwal, Deo, Atharva, Yang, Cherine, Moran, Jay, Hernandez, Miguel E., Wager, Peter, Lin, Jasmine, Kim, Jeanine, Goh, Alvin C., Wagner, Christian, Sonn, Geoffrey A., Hung, Andrew J.

arXiv.org Artificial Intelligence

Fine-grained analysis of intraoperative behavior and its impact on patient outcomes remains a longstanding challenge. We present Frame-to-Outcome (F2O), an end-to-end system that translates tissue dissection videos into gesture sequences and uncovers patterns associated with postoperative outcomes. Leveraging transformer-based spatial and temporal modeling and frame-wise classification, F2O robustly detects consecutive short (~2 seconds) gestures in the nerve-sparing step of robot-assisted radical prostatectomy (AUC: 0.80 frame-level; 0.81 video-level). F2O-derived features (gesture frequency, duration, and transitions) predicted postoperative outcomes with accuracy comparable to human annotations (0.79 vs. 0.75; overlapping 95% CI). Across 25 shared features, effect size directions were concordant, with small differences (~0.07) and strong correlation (r = 0.96, p < 1e-14). F2O also captured key patterns linked to erectile function recovery, including prolonged tissue peeling and reduced energy use. By enabling automatic interpretable assessment, F2O establishes a foundation for data-driven surgical feedback and prospective clinical decision support.
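As an illustration of the three feature families named in the abstract (gesture frequency, duration, and transitions), the following is a minimal sketch of how they could be derived from a frame-wise label sequence. It is not the authors' implementation: the function name, the gesture labels "peel" and "cut", and the 30 fps frame rate are all assumptions made for the example.

```python
from collections import Counter
from itertools import groupby


def gesture_features(frame_labels, fps=30.0):
    """Summarize a frame-wise gesture label sequence.

    frame_labels: one gesture name per video frame (hypothetical labels).
    fps: assumed frame rate, used only to convert frame counts to seconds.
    """
    # Collapse runs of identical frame labels into (gesture, n_frames) segments.
    segments = [(g, sum(1 for _ in run)) for g, run in groupby(frame_labels)]

    # Frequency: number of separate segments per gesture.
    frequency = Counter(g for g, _ in segments)

    # Duration: total seconds spent in each gesture.
    duration = Counter()
    for g, n in segments:
        duration[g] += n / fps

    # Transitions: counts of (previous, next) gesture pairs between segments.
    names = [g for g, _ in segments]
    transitions = Counter(zip(names, names[1:]))
    return dict(frequency), dict(duration), dict(transitions)


# Toy sequence: 2 s of peeling, 1 s of cutting, 3 s of peeling.
labels = ["peel"] * 60 + ["cut"] * 30 + ["peel"] * 90
freq, dur, trans = gesture_features(labels, fps=30.0)
print(freq, dur, trans)
```

Features of this kind are simple counts and sums, which is one reason they lend themselves to interpretable comparison against human annotations.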


CNN-LSTM Hybrid Model for AI-Driven Prediction of COVID-19 Severity from Spike Sequences and Clinical Data

Cheohen, Caio, Gomes, Vinnícius M. S., da Silva, Manuela L.

arXiv.org Artificial Intelligence

The COVID-19 pandemic, caused by SARS-CoV-2, highlighted the critical need for accurate prediction of disease severity to optimize healthcare resource allocation and patient management. The spike protein, which facilitates viral entry into host cells, exhibits high mutation rates, particularly in the receptor-binding domain, influencing viral pathogenicity. Artificial intelligence approaches, such as deep learning, offer promising solutions for leveraging genomic and clinical data to predict disease outcomes. Objective: This study aimed to develop a hybrid CNN-LSTM deep learning model to predict COVID-19 severity using spike protein sequences and associated clinical metadata from South American patients. Methods: We retrieved 9,570 spike protein sequences from the GISAID database, of which 3,467 met inclusion criteria after standardization. The dataset included 2,313 severe and 1,154 mild cases. A feature engineering pipeline extracted features from sequences, while demographic and clinical variables were one-hot encoded. A hybrid CNN-LSTM architecture was trained, combining CNN layers for local pattern extraction and an LSTM layer for long-term dependency modeling. Results: The model achieved an F1 score of 82.92%, ROC-AUC of 0.9084, precision of 83.56%, and recall of 82.85%, demonstrating robust classification performance. Training stabilized at 85% accuracy with minimal overfitting. The most prevalent lineages (P.1, AY.99.2) and clades (GR, GK) aligned with regional epidemiological trends, suggesting potential associations between viral genetics and clinical outcomes. Conclusion: The CNN-LSTM hybrid model effectively predicted COVID-19 severity using spike protein sequences and clinical data, highlighting the utility of AI in genomic surveillance and precision public health. Despite limitations, this approach provides a framework for early severity prediction in future outbreaks.
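The abstract states that demographic and clinical variables were one-hot encoded. The sketch below shows what that encoding step looks like in general; the column names, category values, and helper function are hypothetical and are not taken from the study's pipeline.

```python
def one_hot_encode(records, columns):
    """One-hot encode categorical columns of a list of dict records."""
    # Sort categories per column so the output layout is deterministic.
    categories = {c: sorted({r[c] for r in records}) for c in columns}
    names = [f"{c}={v}" for c in columns for v in categories[c]]
    rows = [
        [1 if r[c] == v else 0 for c in columns for v in categories[c]]
        for r in records
    ]
    return names, rows


# Hypothetical patient metadata; real fields would come from the dataset.
patients = [
    {"sex": "F", "age_group": "60+"},
    {"sex": "M", "age_group": "18-59"},
    {"sex": "F", "age_group": "18-59"},
]
names, rows = one_hot_encode(patients, ["sex", "age_group"])
print(names)
for row in rows:
    print(row)
```

Each categorical value becomes its own binary column, so the encoded vectors can be concatenated with sequence-derived features before being passed to a model.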


A Computational Approach to Epilepsy Treatment: An AI-optimized Global Natural Product Prescription System

Wang, Zhixuan

arXiv.org Artificial Intelligence

Epilepsy is a prevalent neurological disease with millions of patients worldwide. Many patients have turned to alternative medicine due to the limited efficacy and side effects of conventional antiepileptic drugs. In this study, we developed a computational approach to optimize herbal epilepsy treatment through AI-driven analysis of global natural products and statistically validated randomized controlled trials (RCTs). Our intelligent prescription system combines machine learning (ML) algorithms for herb-efficacy characterization, Bayesian optimization for personalized dosing, and meta-analysis of RCTs for evidence-based recommendations. The system analyzed 1,872 natural compounds from traditional Chinese medicine (TCM), Ayurveda, and ethnopharmacological databases, integrating their bioactive properties with clinical outcomes from 48 RCTs covering 48 epilepsy conditions (n=5,216). Cohen's d=0.89) with statistical significance confirmed by multiple testing (p<0.001). A randomized double-blind validation trial (n=120) demonstrated 28.5% greater seizure frequency reduction with AI-optimized herbal prescriptions compared to conventional protocols (95% CI: 18.7-37.3%, p=0.003).

Keywords: epilepsy, herbal medicine, computational pharmacology, AI-optimized prescription, natural products, machine learning, precision medicine, Bayesian optimization, clinical validation

Introduction: Despite being among the most difficult to treat neurological disorders (World Health Organization: WHO, 2024), it is estimated by the World Health Organization that there are close to 50 million people living with epilepsy (Figure 1A: Global Epilepsy Prevalence and Treatment Gaps).


Predicting Clinical Outcomes with Waveform LSTMs

Albada, Michael

arXiv.org Artificial Intelligence

Data mining and machine learning hold great potential to enable health systems to systematically use data and analytics to identify inefficiencies and best practices that improve care and reduce costs. Waveform data offers particularly detailed information on how patient health evolves over time and has the potential to significantly improve prediction accuracy on multiple benchmarks, but has been widely under-utilized, largely because of the challenges in working with these large and complex datasets. This study evaluates the potential of leveraging clinical waveform data to improve prediction accuracy on a single benchmark task: the risk of mortality in the intensive care unit. We identify significant potential from this data, beating the existing baselines for both logistic regression and deep learning models.


Foundation Model of Electronic Medical Records for Adaptive Risk Estimation

Renc, Pawel, Grzeszczyk, Michal K., Oufattole, Nassim, Goode, Deirdre, Jia, Yugang, Bieganski, Szymon, McDermott, Matthew B. A., Was, Jaroslaw, Samir, Anthony E., Cunningham, Jonathan W., Bates, David W., Sitek, Arkadiusz

arXiv.org Artificial Intelligence

We developed the Enhanced Transformer for Health Outcome Simulation (ETHOS), an AI model that tokenizes patient health timelines (PHTs) from EHRs. ETHOS predicts future PHTs using transformer-based architectures. The Adaptive Risk Estimation System (ARES) employs ETHOS to compute dynamic and personalized risk probabilities for clinician-defined critical events. ARES incorporates a personalized explainability module that identifies key clinical factors influencing risk estimates for individual patients. ARES was evaluated on the MIMIC-IV v2.2 dataset in emergency department (ED) settings, benchmarking its performance against traditional early warning systems and machine learning models. We processed 299,721 unique patients from MIMIC-IV into 285,622 PHTs, with 60% including hospital admissions. The dataset contained over 357 million tokens. ETHOS outperformed benchmark models in predicting hospital admissions, ICU admissions, and prolonged hospital stays, achieving superior AUC scores. ETHOS-based risk estimates demonstrated robustness across demographic subgroups with strong model reliability, confirmed via calibration curves. The personalized explainability module provides insights into patient-specific factors contributing to risk. ARES, powered by ETHOS, advances predictive healthcare AI by providing dynamic, real-time, and personalized risk estimation with patient-specific explainability to enhance clinician trust. Its adaptability and superior accuracy position it as a transformative tool for clinical decision-making, potentially improving patient outcomes and resource allocation in emergency and inpatient settings. We release the full code at github.com/ipolharvard/ethos-ares to facilitate future research.
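One common way to turn a generative timeline model into an event-risk estimate is Monte Carlo simulation: sample many possible future token sequences and report the fraction in which the event of interest appears. The abstract does not spell out how ARES computes its probabilities, so the sketch below is an assumption about the general idea, not the released method. The sampler, the token vocabulary, and the function names are hypothetical stand-ins.

```python
import random


def estimate_risk(sample_future, event_token, n_samples=1000, seed=0):
    """Fraction of sampled future timelines that contain event_token.

    sample_future: callable taking a random.Random and returning a list of
    tokens; here it stands in for a trained generative model (assumption).
    """
    rng = random.Random(seed)
    hits = sum(event_token in sample_future(rng) for _ in range(n_samples))
    return hits / n_samples


def toy_sampler(rng):
    # Hypothetical stand-in for a model: three uniformly random future tokens.
    vocabulary = ["LAB", "MED", "DISCHARGE", "ICU_ADMISSION"]
    return [rng.choice(vocabulary) for _ in range(3)]


risk = estimate_risk(toy_sampler, "ICU_ADMISSION")
print(f"Estimated ICU admission risk: {risk:.2f}")
```

For this toy sampler the true probability is 1 - (3/4)^3, roughly 0.58, so the printed estimate should land near that value. With a real model, the sampler would condition on the patient's tokenized history.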


CovidLLM: A Robust Large Language Model with Missing Value Adaptation and Multi-Objective Learning Strategy for Predicting Disease Severity and Clinical Outcomes in COVID-19 Patients

Zhu, Shengjun, Liu, Siyu, Li, Yang, Lei, Qing, Hou, Hongyan, Jiang, Hewei, Guo, Shujuan, Wang, Feng, Chen, Rongshang, Fan, Xionglin, Tao, Shengce, Cai, Jiaxin

arXiv.org Artificial Intelligence

Coronavirus Disease 2019 (COVID-19), which emerged in 2019, has caused millions of deaths worldwide. Although effective vaccines have been developed to mitigate severe symptoms, certain populations, particularly the elderly and those with comorbidities, remain at high risk for severe outcomes and increased mortality. Consequently, early identification of the severity and clinical outcomes of the disease in these patients is vital to prevent adverse prognoses. Although traditional machine learning and deep learning models have been widely employed in this area, the potential of large language models (LLMs) remains largely unexplored. Our research focuses primarily on constructing specialized prompts and adopting multi-objective learning strategies. We started by selecting serological indicators that significantly correlate with clinical outcomes and disease severity to serve as input data for the model. Blood test samples often contain numerous missing values, and traditional models generally rely on imputation to handle these gaps in the data. In contrast, LLMs offer the advantage of robust semantic understanding. By setting prompts, we can explicitly inform the model when a feature's value is missing, without the need for imputation. For the multi-objective learning strategy, the model is designed to first predict disease severity and then predict clinical outcomes. Given that LLMs utilize both the input text and the generated tokens as input for generating the next token, the predicted severity is used as a basis for generating the clinical outcome. During the fine-tuning of the LLM, the two objectives influence and improve each other. Our experiments were implemented based on the ChatGLM model. The results demonstrate the effectiveness of LLMs in this task, suggesting promising potential for further development.
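To make the missing-value idea concrete, here is a minimal sketch of a prompt builder that states explicitly when an indicator was not measured, instead of imputing a value, and that asks for severity before outcome. The wording of the prompt, the indicator names, and the function are illustrative assumptions; the paper's actual prompts are not given in the abstract.

```python
def build_prompt(features):
    """Build a text prompt from serological indicators.

    features: mapping of indicator name to a numeric value, or None when the
    measurement is missing. Names and phrasing are hypothetical.
    """
    lines = []
    for name, value in features.items():
        if value is None:
            # Tell the model the value is missing rather than imputing it.
            lines.append(f"- {name}: missing (not measured)")
        else:
            lines.append(f"- {name}: {value}")
    return (
        "Serological indicators for one patient:\n"
        + "\n".join(lines)
        + "\nFirst state the disease severity, then the clinical outcome."
    )


prompt = build_prompt({"CRP": 12.4, "IL-6": None, "lymphocyte count": 0.9})
print(prompt)
```

Because the severity is requested first, an autoregressive model generates the outcome with the predicted severity already in its context, which mirrors the two-stage objective described above.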


Uncertainty Quantification for Clinical Outcome Predictions with (Large) Language Models

Chen, Zizhang, Li, Peizhao, Dong, Xiaomeng, Hong, Pengyu

arXiv.org Artificial Intelligence

Language models, such as [1, 2, 3], have emerged as an efficient tool in the domain of EHR tasks. These models, extensively trained on diverse sources of clinical data, such as physician notes and longitudinal medical codes, have demonstrated remarkable effectiveness in predicting clinical outcomes. Despite their capabilities, measuring and reducing the uncertainties of these models in EHR tasks is crucial for ensuring patient safety, as clinicians can avoid interventions that the model indicates are uncertain and potentially hazardous. In addition, quantifying the uncertainties in clinical tasks can enhance the reliability of AI-driven medical decision-making systems [4]. To address this challenge, leveraging the transparency of model parameters, we utilize established uncertainty metrics and propose to combine them with ensembling and multi-tasking approaches to effectively quantify and mitigate uncertainties in EHR tasks for these white-box language models. Recently, large language models have begun to demonstrate their utility in clinical tasks, including EHR prediction tasks [5], analyzing radiology report examinations [6], and medical reasoning [7]. However, modern large language models are typically offered as API services with restricted access to internal model parameters and prediction probabilities, which impedes the direct application of traditional uncertainty quantification methods. To overcome this limitation, we redefine uncertainty quantification as a post-hoc approach by analyzing the distribution of answers generated repeatedly from our designed prompts for clinical prediction tasks. Inspired by the effectiveness of our proposed methods in reducing model uncertainty for white-box LMs, we adapted and applied ensembling and multi-tasking methods to the black-box settings.
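The black-box setting described above can be illustrated with a small sketch: query the model several times with the same prompt, then measure how spread out the answers are. Shannon entropy of the empirical answer distribution is used here as one plausible spread measure; the abstract does not name the specific metric, so that choice, the function name, and the sample answers are assumptions.

```python
import math
from collections import Counter


def answer_uncertainty(answers):
    """Summarize repeated answers from a black-box model.

    Returns the majority answer, the empirical answer probabilities, and the
    Shannon entropy in bits (0 means all answers agree).
    """
    counts = Counter(answers)
    total = len(answers)
    probs = {a: c / total for a, c in counts.items()}
    entropy = -sum(p * math.log2(p) for p in probs.values())
    majority = max(probs, key=probs.get)
    return majority, probs, entropy


# Hypothetical answers from four repeated calls with the same prompt.
samples = ["yes", "yes", "no", "yes"]
majority, probs, entropy = answer_uncertainty(samples)
print(majority, probs, round(entropy, 3))
```

A prediction whose repeated answers disagree gets a higher entropy and can be flagged for clinician review, without any access to model parameters or token probabilities.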


From Glucose Patterns to Health Outcomes: A Generalizable Foundation Model for Continuous Glucose Monitor Data Analysis

Lutsker, Guy, Sapir, Gal, Godneva, Anastasia, Shilo, Smadar, Greenfield, Jerry R, Samocha-Bonet, Dorit, Mannor, Shie, Meirom, Eli, Chechik, Gal, Rossman, Hagai, Segal, Eran

arXiv.org Artificial Intelligence

Recent advances in self-supervised learning have enabled novel medical AI models, known as foundation models (FMs), that offer great potential for characterizing health from diverse biomedical data. Continuous glucose monitoring (CGM) provides rich, temporal data on glycemic patterns, but its full potential for predicting broader health outcomes remains underutilized. Here, we present GluFormer, a generative foundation model on biomedical temporal data based on a transformer architecture, and trained on over 10 million CGM measurements from 10,812 non-diabetic individuals. We tokenized the CGM training data and trained GluFormer using next token prediction in a generative, autoregressive manner. We demonstrate that GluFormer generalizes effectively to 15 different external datasets, including 4936 individuals across 5 different geographical regions, 6 different CGM devices, and several metabolic disorders, including normoglycemic, prediabetic, and diabetic populations, as well as those with gestational diabetes and obesity. GluFormer produces embeddings which outperform traditional CGM analysis tools, and achieves high Pearson correlations in predicting clinical parameters such as HbA1c, liver-related parameters, blood lipids, and sleep-related indices. Notably, GluFormer can also predict onset of future health outcomes even 4 years in advance. We also show that CGM embeddings from pre-intervention periods in Randomized Clinical Trials (RCTs) outperform other methods in predicting primary and secondary outcomes. When integrating dietary data into GluFormer, we show that the enhanced model can accurately generate CGM data based only on dietary intake data, simulate outcomes of dietary interventions, and predict individual responses to specific foods. Overall, we show that GluFormer accurately predicts health outcomes which generalize across different populations and metabolic conditions.
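The tokenization and next-token setup mentioned in the abstract can be sketched as follows. The binning scheme (clip to a 40 to 400 mg/dL range, one token per 1 mg/dL), the context length, and the function names are assumptions chosen for illustration; the abstract does not give the actual scheme.

```python
def tokenize_cgm(readings_mg_dl, low=40, high=400, bin_width=1):
    """Map glucose readings to integer token ids by clipping and binning.

    The range and bin width are assumed values, not the published ones.
    """
    tokens = []
    for r in readings_mg_dl:
        clipped = min(max(r, low), high)
        tokens.append(int((clipped - low) // bin_width))
    return tokens


def next_token_pairs(tokens, context=3):
    """Build (context window, next token) training pairs for autoregression."""
    return [
        (tokens[i:i + context], tokens[i + context])
        for i in range(len(tokens) - context)
    ]


# Hypothetical consecutive CGM readings in mg/dL.
readings = [95, 102, 110, 130, 125]
tokens = tokenize_cgm(readings)
pairs = next_token_pairs(tokens, context=3)
print(tokens)
print(pairs)
```

Each pair asks the model to predict the next glucose token from the preceding window, which is the same objective used to train text language models.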


Patient-centered data science: an integrative framework for evaluating and predicting clinical outcomes in the digital health era

Amoei, Mohsen, Poenaru, Dan

arXiv.org Artificial Intelligence

This study proposes a novel, integrative framework for patient-centered data science in the digital health era. We developed a multidimensional model that combines traditional clinical data with patient-reported outcomes, social determinants of health, and multi-omic data to create comprehensive digital patient representations. Our framework employs a multi-agent artificial intelligence approach, utilizing various machine learning techniques including large language models, to analyze complex, longitudinal datasets. The model aims to optimize multiple patient outcomes simultaneously while addressing biases and ensuring generalizability. We demonstrate how this framework can be implemented to create a learning healthcare system that continuously refines strategies for optimal patient care. This approach has the potential to significantly improve the translation of digital health innovations into real-world clinical benefits, addressing current limitations in AI-driven healthcare models.